SOFTWARE SOLUTIONS TO
INTERNET PERFORMANCE PROBLEMS

An Industry Advisory Paper
by
Versant Object Technology

INTRODUCTION

The Internet and the World Wide Web are among the most far-reaching publishing and communications systems in history. The prospect of extending this functionality into global transactioning holds the potential to revolutionize business the world over. But this potential is at risk as growth in Internet usage outstrips the capacity of the existing infrastructure to support it. Specifically, increases in usage and growing complexity in content have dramatically reduced the responsiveness, and even the accessibility, of much of the Web and have earned it the derisive epithet "World Wide Wait." As a result, more and more corporate users have begun building closed, private networks, portending a world of Balkanized systems with limited access, conflicting standards, diminished utility and, ultimately, reduced participation. Bummer.

Conventional responses to the problem have been slow in coming, piecemeal, and focused mainly on hardware- and transmission-based solutions: faster modems, more Web servers, and an almost mystical faith in the imminent salvation of Bigger Pipes, like ATM. Yet all of these approaches have severe limitations, ranging from crippling expense to being beyond the user's control to being of little help at all.

Software-based solutions built on object technology, on the other hand, are available now, can be implemented to address system-wide limitations, cost a fraction of what hardware-based solutions do, exhibit a much greater range of scalability, and can greatly speed the performance of all components in a vastly distributed, hugely scalable network. The telephony industry's adoption of these approaches to manage the world's telephone switching and transmission fabric is indicative of their maturity, scalability, performance and robustness.

This paper is intended for two classes of readers: Web-site managers building large corporate Intranets, and Internet infrastructure providers--ISPs, large content publishers, Web-farm hosters, advanced development tools vendors, and the like. It examines the three most important sources of slow performance on the World Wide Web, looks at conventional hardware- and transmission-based solutions, and compares them with software-based alternatives.

SOURCES OF PERFORMANCE PROBLEMS

A systems view of the World Wide Web reveals three principal "choke points" at which throughput is bottlenecked and performance is degraded. These three problem points are illustrated in Figure 1. The perspective in Figure 1 and throughout this paper is that of a corporate Internet user, though the problems and solutions apply equally well to individual users surfing the Web.

FIGURE 1

Choke Point Number 1: The Illusion of Bandwidth Constraint

The first choke point, User Bandwidth, occurs at the corporate firewall. Dozens to hundreds or even thousands of users access the Internet through a single gateway. At this choke point, LAN speeds, typically 10 Mbps for Ethernet, collapse to a telephone-line or leased-line connection commonly operating at 56 Kbps. As Internet usage increases, the average bandwidth available to any one of N users becomes simply the size of the pipe divided by N. As the net-connected user population grows, the conventional solution to this problem is to install more gateways, thereby increasing hardware costs, diffusing control and raising maintenance overhead.

Ironically, it is not the roughly 180X "narrowing down" of the pipe from 10 Mbps to 56 Kbps, or even the increase in the number of users, which is the real source of the first performance bottleneck. Rather, it is the constant use of the fixed pipe for issuing and fulfilling duplicate requests that degrades performance and response times. User studies indicate that as many as 90% of the requests going out from a corporate firewall are for information which has already been requested by other users in the organization. Even in organizations with as little as 60% commonality in requests for information, this represents the waste of roughly three fifths of the available bandwidth. It's like sending someone down to Kinko's six separate times to make six copies of a letter.

In an object-based software solution, these duplicate requests are fulfilled once from across the firewall by navigating the Internet, then cached at a server inside the firewall and used to fulfill subsequent requests. This is like having a copy machine in your office: you only have to go to Kinko's once, for a master copy, and can satisfy your subsequent copying needs from within. Using this approach, the organization with 90% common requests immediately experiences an effective tenfold (10X) increase in available bandwidth, and users see roughly a 180X increase in performance for cached information. These gains occur as trans-Internet requests are reduced by 90% and as the perceived "Net" responds at Ethernet speeds rather than at the 56 Kbps of the telephone line.
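
The mechanics can be sketched in a few lines of Python. This is a minimal illustration only--the FirewallCache class and the fetch_from_internet helper are hypothetical stand-ins, not the Spyglass or Versant product--but it shows how the first request for a page crosses the 56 Kbps link while every duplicate is served from the local cache.

def fetch_from_internet(url):
    """Hypothetical stand-in for an expensive trans-Internet request."""
    return "<html>content for %s</html>" % url

class FirewallCache:
    def __init__(self):
        self.store = {}           # URL -> cached document
        self.internet_fetches = 0
        self.cache_hits = 0

    def get(self, url):
        if url in self.store:             # duplicate request: served at LAN speed
            self.cache_hits += 1
            return self.store[url]
        self.internet_fetches += 1        # first request: cross the firewall once
        document = fetch_from_internet(url)
        self.store[url] = document
        return document

cache = FirewallCache()
# 100 requests, 90% of them duplicates of ten popular pages:
for i in range(100):
    cache.get("http://example.com/page%d" % (i % 10))

print(cache.internet_fetches, "requests crossed the 56 Kbps link")   # 10
print(cache.cache_hits, "requests served from the local cache")      # 90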

In fact, responsiveness is many times higher, as each request does not have to traverse the total end-to-end circuit of the Internet, contending for scarce server resources on the other end, but is fulfilled out of local cache at memory speeds. Empirical tests show response times for such locally cached information to be in the range of 50-60 ms (milliseconds), versus as much as 5,000 ms for routine requests requiring Internet-mediated fulfillment.

This software-based solution to perceived bandwidth constraints is illustrated in Figure 2.

FIGURE 2

Equally important to the performance of the larger Internet is the fact that as each user site implements such a system, it reduces its burden on the larger Internet by the amount of its duplicate traffic--up to 90% in the above example--thus freeing up resources to sustain other users' growth at acceptable levels of performance and throughput. For the civic-minded, this is the near-hallowed case where the pure pursuit of private gain leads directly to public virtue, as the Internet "commons" is spared the burden of sustaining redundant traffic and can conserve capacity to satisfy unique requests. This product, built on Versant's object database and advanced caching technology, is available today from Spyglass and its 100+ OEMs as the Spyglass ProServer.

Choke Point Number 2: Connection and File Service Capacity

At the other end of the Internet cloud lies the Web server site (see Figure 1). This is the second major choke point affecting the performance of Internet and Web systems. The challenge for content publishers, Internet Service Providers, Web farms and related players is to deploy a limited set of resources to respond in a timely manner to an unpredictable, spiky and possibly torrential demand for services.

In this environment, response time is everything and depends on two straightforward but interdependent variables: how long it takes to get a connection; and, once secured, how long it takes for requests to be satisfied. The vast majority of content on the Web is currently stored in native file systems provided by the operating system of the host hardware platform. These systems are designed to address the most generic set of data storage problems but have never been optimized for concurrency or performance in a multi-user environment.

For example, a request for WWW/xxx/yyy/zzz/HTML to a Web site with 250,000 URLs (Uniform Resource Locators) may require as many as 5,000 disk sector reads before the file system even gets to the requested file. These are among the most expensive operations in the entire computing process, consuming more time than almost any other activity. And then, once the file is located, the system still has to fetch it from disk, which may involve many separate I/O operations, and finally return it to the user. Now, if the system requires 50 ms to service an average request, it can support 20 connections per second (1,000 ms / 50 ms) before requests begin to queue and response times bog down. But when, as in this example, requests for service arrive at a rate greater than 20 per second, the response time perceived by the user degrades, and continues to degrade linearly.

A simple example illustrates this point. At 40 requests per second the following occurs: at time 0 the first request arrives and processing begins. 25 ms later the next request arrives, but the processor is still occupied with request 1. Please hold. Finally, at 50 ms, request 1 is satisfied and work begins on request 2, which has been waiting 25 ms and still needs another 50 ms to be completed. Request 3 has now also arrived and is put on hold until 100 ms, when request 2 is completed. By now, requests 4 and 5 have also arrived and are waiting. Note that while request 1 took 50 ms to be serviced, request 2 required 75 ms (25 ms of wait and 50 ms of processing time) and request 3 will need 100 ms (50 ms of wait and 50 ms of processing time). This behavior is illustrated in the accompanying Table 1. Note that within only five requests, the elapsed time to service a request has tripled. The user has the feeling of swimming through peanut butter while his browser screen staggers like a drunken horse. In disgust, he quickly abandons the site.

Table 1: Impact of Queueing Delay on Service Response Time (all times in ms)

Request   Arrival Time   Wait Time   Service Time   Completion Time   Total Elapsed Time
1         0              0           50             50                50
2         25             25          50             100               75
3         50             50          50             150               100
4         75             75          50             200               125
5         100            100         50             250               150
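
The arithmetic in Table 1 can be verified with a short simulation. The sketch below--a minimal Python illustration, assuming the fixed 50 ms service time and 25 ms arrival interval from the example above--reproduces the table line for line.

# Reproduce Table 1: single server, 50 ms service time, arrivals every 25 ms.
SERVICE_TIME = 50      # ms to satisfy one request
ARRIVAL_INTERVAL = 25  # ms between arrivals (40 requests per second)

server_free_at = 0
print("Request  Arrival  Wait  Service  Completion  Elapsed")
for n in range(1, 6):
    arrival = (n - 1) * ARRIVAL_INTERVAL
    start = max(arrival, server_free_at)   # queue if the server is still busy
    wait = start - arrival
    completion = start + SERVICE_TIME
    elapsed = completion - arrival
    server_free_at = completion
    print("%7d %8d %5d %8d %11d %8d" % (n, arrival, wait, SERVICE_TIME, completion, elapsed))

Raising ARRIVAL_INTERVAL to 50 ms or more (that is, 20 requests per second or fewer) drives the wait column to zero, which is exactly the threshold computed above.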

The conventional solution to this problem is to throw more hardware at it. First, faster processors, then multi-processor machines, then multiple multiprocessor machines, then multiple RISC machines, then multiple...well, you get the idea. This is not only expensive, it also results in significant incremental management and maintenance overhead. And, it has the additionally distasteful characteristic of never ending, a source of unremitting woe to beleaguered Web site managers and a fact not lost on gleeful hardware vendors.

A software-based solution, on the other hand, uses an object database as the repository for Web content and the database's hash table as a hot cache for maintaining directory listings. Instead of multiple disk I/O requests, Web objects are located with a single lookup, freeing up expensive operating system resources for more task-appropriate activities such as managing connection requests. The database has been specifically optimized for managing huge volumes of abstract data with high performance and high concurrency in very large, distributed, multi-user environments. It is the basis for some of the world's largest transactional billing systems, in some cases managing over 100 GB of data, hundreds of thousands of user connections, and supporting over 200 million transactions a day with average response times below three seconds.
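
The difference can be sketched as follows. Nothing here is Versant's actual API--the content_index and the toy filesystem are invented for illustration--but it shows why one hashed lookup beats walking a directory tree one component at a time.

# Sketch: resolving a URL path with a hashed index versus walking directories.
content_index = {
    "/products/odbms/overview.html": "<html>overview</html>",
    "/products/odbms/specs.html": "<html>specs</html>",
}

filesystem = {
    "products": {"odbms": {"overview.html": "<html>overview</html>",
                           "specs.html": "<html>specs</html>"}}
}

def serve_with_hash_index(path):
    # One in-memory hash lookup; no directory traversal, no extra disk seeks.
    return content_index.get(path)

def serve_with_directory_walk(path):
    # A native file system resolves the path one component at a time;
    # each step below stands in for one or more disk reads.
    node = filesystem
    for component in path.strip("/").split("/"):
        node = node[component]
    return node

print(serve_with_hash_index("/products/odbms/specs.html"))
print(serve_with_directory_walk("/products/odbms/specs.html"))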

FIGURE 3

Empirical tests managing content and server access on large-scale Web sites reveal at least a 30X increase in throughput relative to native file systems. In one especially noteworthy case, a single-processor, 75 MHz Pentium machine with 8 MB of RAM (cost: under $2,000) sustained over 250 requests per second and serviced over 40 million requests a day, outperforming a $30,000 RISC machine from one of the world's largest (and newly chagrined) computer companies. This product, also built on Versant's object database and advanced caching technology, will soon be available for large Web sites, Internet Service Providers, Internet infrastructure vendors and major content publishers.

Choke Point Number 3: Transactional Access Time

Finally, the third choke point at which system performance is impacted is Transactional Access. Static and even dynamic publishing, as discussed in the section above, are relatively straightforward: an HTML page or a GIF file is requested from the server, located, and returned to the user. End of connection. In transactional environments, however, a much more involved process typically occurs. In the simplest transaction, say where a person wants to buy a CD-ROM, the system must access inventory records, check for availability, place a temporary hold on the item, confirm the user's creditworthiness, debit his account, debit the inventory for the requested item, notify shipping of the order and, finally, confirm the transaction with the user.
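
The number of coordinated steps involved can be sketched as below. Every data structure and helper is a hypothetical stand-in invented for illustration; the point is simply that a single "buy" request touches inventory, credit, account and shipping records before the user ever sees a confirmation.

# Sketch of the steps behind one simple purchase transaction.
inventory = {"cdrom-042": {"price": 29.95, "quantity": 3, "held": 0}}
accounts  = {"alice": {"balance": 100.00}}

def credit_check(user):
    return accounts[user]["balance"] > 0

def purchase(user, item_id):
    item = inventory[item_id]                    # access inventory records
    if item["quantity"] - item["held"] < 1:      # check availability
        return "out of stock"
    item["held"] += 1                            # temporary hold on the item
    if not credit_check(user):                   # confirm creditworthiness
        item["held"] -= 1
        return "credit declined"
    accounts[user]["balance"] -= item["price"]   # debit the user's account
    item["quantity"] -= 1                        # debit the inventory
    item["held"] -= 1
    print("shipping notified:", user, item_id)   # notify shipping of the order
    return "order confirmed"                     # confirm with the user

print(purchase("alice", "cdrom-042"))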

The Glass House manner of supporting such transactions is to use a relational database and its SQL language interface. Relational technology was invented in 1970 and was designed to provide ad hoc access to mainframes to query simple types of data and simple relationships between the data--a part number to an invoice, for example. It is brilliant at such simple, static, centralized and constrained applications. In the dynamic, wide-open, distributed world of the Internet and World Wide Web, however, it is much less able to support the exploding needs of electronic commerce.

First, relational systems represent the world as a series of two-dimensional tables with rows and columns. "Relations" between data are managed with indexes and foreign keys pointing to multiple other tables, which are searched and "joined" together to render the results of even simple queries. For a world of limited data sets, a fixed number of users, simple data types and simple data relationships, this is quite competent. But as the number of users grows (and, in fact, becomes unpredictable), as data types become more abstract--audio, video, schematics, full-motion graphics, etc.--and as data relationships become more complex, the process quickly breaks down, imposing huge delays and performance burdens on Web sites.

Simply parsing the SQL query statement creates processing overhead. Then "joining" tables and culling through perhaps millions of unnecessary records to render the one data element needed exacts an enormous toll on performance. As the size of the data set and the complexity of the data relationships (more tables) grow, more and more of the system's resources are consumed in this by-product activity, until response times effectively grow without bound. Leading relational databases use a standard 4K page size, blindly consuming 4K of core memory for even the smallest requested items. This significantly reduces the amount of information that can be cached and forces the system to repeatedly resort to I/O-based service fulfillment. We're talking S-L-O-W. And, since these systems are artifacts of the mainframe world, they have no capacity to tune applications on the client side--clients in that era were 3270 terminals; there was nothing to tune. Note that because of these inherent limitations, the religious wars in the relational world have come to center on which vendor runs on the most symmetric multiprocessors--again, a hardware recourse which betrays the essential exhaustion of the software paradigm.

Contrast this with an object database environment. Since data is not represented in tables, there is no joining. (There is also no "mapping" of three-dimensional data or relationships down onto two-dimensional space. This process, while not impacting in-service performance directly, adds a huge amount of complexity to application development and renders many large relational applications relatively intractable compared to their object-based alternatives.) In object databases, data is accessed by direct navigation to the requested objects, which can number in the hundreds of trillions (yes, that's a "t")! Direct hash lookups completely avoid the overhead of parsing SQL statements for every database access. Object-level granularity permits the server to use only the amount of memory needed for an individual object--if it's 400 bytes, it uses 400 bytes, not 4K. This greatly improves the efficiency of in-memory operations.
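
The contrast can be sketched with plain Python objects standing in for persistent ones. The classes below are illustrative, not Versant's API: where a relational system would parse a SQL statement and join three tables, the object model simply follows references from one object to the next.

# Sketch: direct object navigation versus a relational join.
class Address:
    def __init__(self, city):
        self.city = city

class Customer:
    def __init__(self, name, address):
        self.name = name
        self.address = address      # direct reference, not a foreign key

class Order:
    def __init__(self, order_id, customer):
        self.order_id = order_id
        self.customer = customer    # direct reference, not a foreign key

order = Order(1001, Customer("Acme Corp", Address("Menlo Park")))

# Object navigation: follow references; no SQL parsing, no joins.
print(order.customer.address.city)

# The relational equivalent would parse and execute something like:
#   SELECT a.city FROM orders o
#     JOIN customers c ON o.customer_id = c.id
#     JOIN addresses a ON c.address_id = a.id
#   WHERE o.order_id = 1001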

And finally, the client side of applications can be "tuned" or optimized for the nature of the task at hand--static display, dynamic display, various types of transactioning, etc. The result? Blazingly fast performance. How fast? For complex applications involving many variables, a database manager at one of the U.S. government's national laboratories has stated publicly that, "Versant is 100 times faster than relational."

Imagine, for example, the following scenario, which is being built today. A user logs onto a Web site and begins browsing. Within 15 seconds, the Web server has accessed third-party databases to determine what other sites this user has visited in the past 30 days--what he's looked at, for how long, what he bought, etc. At the same time, the Web server has consulted demographic and credit databases to understand the user's age, income, purchase patterns and creditworthiness. With this information, it turns to a set of advertisers, presents a profile of the user and auctions the right to present an ad to the user. Should we show him footwear? Cowboy boots or sneakers? Sneakers? Nike or Reebok? What are we bid?

Within 30 seconds of the time the user logged on, the "transaction" is complete. The advertiser has presented a context-appropriate message to a perfectly targeted demographic segment of one. This is typical of what sophisticated vendors and many users are trying to do today with Web transactioning--not just offer simple point-to-point transactions, but correlate individual purchases with other purchases and with affinity-group memberships, offer third-party incentives, base promotions on real-time analysis of the user's demographic characteristics, and perform real-time auctioning, arbitrage, and more.
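
A toy sketch of that profile-and-auction flow appears below. Every data source, bid function and name in it is invented purely for illustration; it describes no shipping system, only the shape of the workflow.

# Hypothetical sketch of the profile-and-auction flow described above.
browsing_history = {"user-7": ["sports-gear.example", "running-news.example"]}
demographics     = {"user-7": {"age": 34, "income": 70000, "creditworthy": True}}

def build_profile(user_id):
    # Pull browsing history and demographic/credit data for this user.
    return {
        "history": browsing_history.get(user_id, []),
        "demographics": demographics.get(user_id, {}),
    }

def auction(profile, advertisers):
    # Each advertiser returns a bid for the right to present an ad.
    bids = [(adv["bid"](profile), adv["ad"]) for adv in advertisers]
    return max(bids)[1]               # highest bidder wins the impression

advertisers = [
    {"ad": "sneakers", "bid": lambda p: 0.50 if p["demographics"].get("creditworthy") else 0.05},
    {"ad": "cowboy boots", "bid": lambda p: 0.20},
]

print(auction(build_profile("user-7"), advertisers))   # sneakers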

And, as with the cases of User Bandwidth and Connection and File Service Capacity discussed above, the software-based alternative using object database technology not only far outperforms older hardware-centric solutions, it is being used today to build the most advanced applications in the world.

Table 2, below, summarizes the sources of and alternative solutions to these Internet performance problems.

Table 2: Sources and Solutions for Internet Performance Problems

User-side "bandwidth"

  Symptoms
    • Slow refresh and download times
  Conventional Solution
    • Higher speed modems
    • More gateways
    • Advanced transmission infrastructures (ISDN, ATM)
  Problem
    • Huge cost
    • Reduced control
    • Increased maintenance
    • No ability to impact capacity
  Object Solution
    • Common hits cached at corporate firewall
    • Requests fulfilled at LAN speeds

Static content access

  Symptoms
    • Unable to log in
    • Slow request fulfillment
  Conventional Solution
    • More Web servers
    • Shallower directory hierarchies
    • Wait for ORDBMS
  Problem
    • Expensive
    • Increased site maintenance burden
    • Constant reconfiguration of Web site structure
  Object Solution
    • Common requests cached for instant vending
    • Native object-based storage and retrieval systems

Transactional access

  Symptoms
    • Long waits for transaction fulfillment
  Conventional Solution
    • Even-larger SMP processors
    • Simplified query models
    • More hardware
  Problem
    • Expensive equipment solution to software problem
    • Hobbled business systems that don't meet business objectives
  Object Solution
    • ODBMS provides direct navigation to transactional objects
    • Coexistence puts appropriate technology to work
CONCLUSION

Those deploying large Web sites or building core components of the Internet infrastructure must confront the limitations of hardware-based solutions to performance problems. Increased usage, an explosion in the density of content, and the emergence of increasingly complex transactions reinforce one another, driving Internet performance into the ground. We are on the verge of a pernicious downward spiral into paralysis and possible abandonment of the Internet and Web as open, high-performance computing, communications and transactioning media. We call this "Netereal Sclerosis".

If these media are not to become the victims of their own success, the industry will have to adopt different approaches to addressing these problems. Hardware-based solutions, besides being too expensive, cannot keep up with the exploding burden. Software-based solutions, on the other hand, especially those based on object database technology, not only address these problems in a much more elegant and elastic manner, they have proven themselves in the hardest environment on earth, the world's telephony industry.

As the supplier of the world's most widely deployed object database management system for high-performance, distributed infrastructure applications, Versant appreciates your interest in this issue and looks forward to engaging the industry as it develops solutions for future Internet and Web-based applications.


©1996 Versant Object Technology
1380 Willow Road
Menlo Park, CA 94025
USA

1-800-VERSANT
Tel 415-329-7500
Fax 415-325-2380
e-mail info@versant.com